Dynamic Appearance: A Video Representation for Action Recognition with Joint Training
The static appearance of a video may impede the ability of a deep neural network to
learn motion-relevant features in video action recognition. In this paper, we
introduce a new concept, Dynamic Appearance (DA), summarizing the appearance
information relating to movement in a video while filtering out the static
information considered unrelated to motion. We consider distilling the dynamic
appearance from raw video data as a means of efficient video understanding. To
this end, we propose the Pixel-Wise Temporal Projection (PWTP), which projects
the static appearance of a video into a subspace within its original vector
space, while the dynamic appearance is encoded in the projection residual
describing a special motion pattern. Moreover, we integrate the PWTP module
with a CNN or Transformer into an end-to-end training framework, which is
optimized by utilizing multi-objective optimization algorithms. We provide
extensive experimental results on four action recognition benchmarks:
Kinetics400, Something-Something V1, UCF101, and HMDB51.
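The projection-residual idea behind PWTP can be illustrated with a toy, non-learned projection (the paper's PWTP module learns the subspace end-to-end; here the static subspace is simply the constant temporal basis, i.e. the per-pixel mean, which is an assumption for illustration only):

```python
import numpy as np

def pwtp_residual(video):
    """Toy pixel-wise temporal projection (illustrative, not the learned PWTP).

    video: (T, H, W) grayscale frames. Each pixel's temporal vector is
    projected onto the constant (all-ones) basis -- its temporal mean -- and
    the projection residual serves as a crude "dynamic appearance".
    """
    T = video.shape[0]
    basis = np.ones(T) / np.sqrt(T)                       # unit constant basis
    coeff = np.tensordot(basis, video, axes=([0], [0]))   # (H, W) coefficients
    static = basis[:, None, None] * coeff                 # static appearance
    return video - static                                 # motion-related residual
```

A static pixel yields a near-zero residual, while a pixel whose intensity changes over time keeps a nonzero residual, which is the separation the abstract describes.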
Learning an evolved mixture model for task-free continual learning
Recently, continual learning (CL) has gained significant interest because it
enables deep learning models to acquire new knowledge without forgetting
previously learnt information. However, most existing works require knowing the
task identities and boundaries, which is rarely available in practice. In
this paper, we address a more challenging and realistic setting in CL, namely
the Task-Free Continual Learning (TFCL) in which a model is trained on
non-stationary data streams with no explicit task information. To address TFCL,
we introduce an evolved mixture model whose network architecture is dynamically
expanded to adapt to the data distribution shift. We implement this expansion
mechanism by evaluating the probability distance between the knowledge stored
in each mixture model component and the current memory buffer using the
Hilbert-Schmidt Independence Criterion (HSIC). We further introduce two simple dropout
mechanisms to selectively remove stored examples in order to avoid memory
overload while preserving memory diversity. Empirical results demonstrate that
the proposed approach achieves excellent performance.Comment: Accepted by the 29th IEEE International Conference on Image
Processing (ICIP 2022
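The probability distance driving the expansion decision can be sketched with the standard biased empirical HSIC estimator over Gaussian kernels (the kernel choice and bandwidth below are assumptions for illustration; the paper applies HSIC between each component's stored knowledge and the current memory buffer):

```python
import numpy as np

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC with Gaussian kernels: tr(K H L H) / (n-1)^2,
    where H is the centering matrix. A larger value indicates stronger
    statistical dependence between the two sample sets."""
    n = X.shape[0]

    def gram(Z):
        sq = np.sum(Z ** 2, axis=1)
        d = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T     # pairwise sq. dists
        return np.exp(-d / (2.0 * sigma ** 2))

    H = np.eye(n) - np.ones((n, n)) / n                   # centering matrix
    K, L = gram(X), gram(Y)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

In an expansion mechanism of this kind, a low HSIC between a new data batch and every existing component would signal a distribution shift and trigger the addition of a new mixture component.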
Binary morphological shape-based interpolation applied to 3-D tooth reconstruction
In this paper we propose an interpolation algorithm using a mathematical morphology morphing approach. The aim of this algorithm is to reconstruct the n-dimensional object from a group of (n-1)-dimensional sets representing sections of that object. The morphing transformation modifies pairs of consecutive sets such that they approach each other in shape and size. The interpolated set is achieved when the two consecutive sets are made idempotent by the morphing transformation. We prove the convergence of the morphological morphing. The entire object is modeled by successively interpolating a certain number of intermediary sets between each two consecutive given sets. We apply the interpolation algorithm to 3-D tooth reconstruction.
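The idea of producing an intermediate section between two consecutive binary slices can be sketched with a distance-based variant of shape-based interpolation (this averages signed city-block distances and thresholds at zero; it is a simpler stand-in, not the paper's iterative morphing operator):

```python
import numpy as np

def dilate(S):
    """One step of 4-connected binary dilation."""
    out = S.copy()
    out[1:, :] |= S[:-1, :]
    out[:-1, :] |= S[1:, :]
    out[:, 1:] |= S[:, :-1]
    out[:, :-1] |= S[:, 1:]
    return out

def signed_dist(S):
    """City-block signed distance: positive inside S, negative outside.
    Assumes S and its complement are both non-empty."""
    d = np.zeros(S.shape)
    cur, k = S.copy(), 0
    while not cur.all():                 # grow S outward: label outside pixels
        nxt = dilate(cur)
        d[nxt & ~cur] = -(k + 1)
        cur, k = nxt, k + 1
    cur, k = ~S, 0
    while not cur.all():                 # grow complement inward: label inside
        nxt = dilate(cur)
        d[nxt & ~cur] = k + 1
        cur, k = nxt, k + 1
    return d

def interpolate(A, B):
    """Intermediate section between binary sections A and B: the zero
    super-level set of the averaged signed distances."""
    return (signed_dist(A) + signed_dist(B)) >= 0
```

Given a large square on one slice and a small concentric square on the next, the interpolated section is a square of intermediate size, which mirrors the "approach each other in shape and size" behaviour described in the abstract.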
Masked Image Residual Learning for Scaling Deeper Vision Transformers
Deeper Vision Transformers (ViTs) are more challenging to train. We expose a
degradation problem in deeper layers of ViT when using masked image modeling
(MIM) for pre-training. To ease the training of deeper ViTs, we introduce a
self-supervised learning framework called Masked Image Residual Learning
(MIRL), which significantly alleviates the degradation problem, making scaling
ViT along depth a promising direction for performance upgrade. We reformulate
the pre-training objective for deeper layers of ViT as learning to recover the
residual of the masked image. We provide extensive empirical evidence showing
that deeper ViTs can be effectively optimized using MIRL and easily gain
accuracy from increased depth. With the same level of computational complexity
as ViT-Base and ViT-Large, we instantiate 4.5x and 2x deeper
ViTs, dubbed ViT-S-54 and ViT-B-48. The deeper ViT-S-54, costing 3x less
than ViT-Large, achieves performance on par with ViT-Large. ViT-B-48 achieves
86.2% top-1 accuracy on ImageNet. On one hand, deeper ViTs pre-trained with
MIRL exhibit excellent generalization capabilities on downstream tasks, such as
object detection and semantic segmentation. On the other hand, MIRL
demonstrates high pre-training efficiency. With less pre-training time, MIRL
yields competitive performance compared to other approaches.
Defining Image Memorability using the Visual Memory Schema
Memorability of an image is a characteristic determined by the human observers’ ability to remember images they have seen. Yet recent work on image memorability defines it as an intrinsic property that can be obtained independent of the observer. The current study aims to enhance our understanding and prediction of image memorability, improving upon existing approaches by incorporating the properties of cumulative human annotations. We propose a new concept called the Visual Memory Schema (VMS) referring to an organization of image components human observers share when encoding and recognizing images. The concept of VMS is operationalised by asking human observers to define memorable regions of images they were asked to remember during an episodic memory test. We then statistically assess the consistency of VMSs across observers for either correctly or incorrectly recognised images. The associations of the VMSs with eye fixations and saliency are analysed separately as well. Lastly, we adapt various deep learning architectures for the reconstruction and prediction of memorable regions in images and analyse the results when using transfer learning at the outputs of different convolutional network layers
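One generic way to quantify how consistent VMSs are across observers (an assumed protocol for illustration, not necessarily the paper's exact statistic) is split-half correlation of the averaged annotation maps:

```python
import numpy as np

def vms_consistency(maps, seed=0):
    """Split-half consistency of per-observer annotation maps: split the
    observers into two random halves, average each half's maps, and
    correlate the two mean maps. Values near 1 indicate a shared visual
    memory schema; values near 0 indicate idiosyncratic annotations."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(maps))
    a = np.mean([maps[i] for i in idx[: len(maps) // 2]], axis=0)
    b = np.mean([maps[i] for i in idx[len(maps) // 2 :]], axis=0)
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]
```

Comparing this score between correctly and incorrectly recognised images would then reveal whether shared schemas accompany successful recognition, in the spirit of the consistency analysis described above.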
Steganalysis of 3D objects using statistics of local feature sets
3D steganalysis aims to identify subtle invisible changes produced in graphical objects through digital watermarking or steganography. Sets of statistical representations of 3D features, extracted from both cover and stego 3D mesh objects, are used as inputs to machine learning classifiers in order to decide whether any information is hidden in the given graphical object. The features proposed in this paper include those representing the local object curvature, vertex normals, and the local geometry representation in the spherical coordinate system. The effectiveness of these features is tested in various combinations with other features used for 3D steganalysis. The relevance of each feature for 3D steganalysis is assessed using the Pearson correlation coefficient. Six different 3D watermarking and steganographic methods are used for creating the stego-objects in the evaluation study.
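Ranking features by Pearson correlation with the cover/stego label, as the relevance assessment above does, is straightforward to sketch (the feature matrix and label layout here are assumptions for illustration):

```python
import numpy as np

def feature_relevance(F, y):
    """Rank features by |Pearson correlation| with the cover(0)/stego(1)
    label. F: (n_samples, n_features), y: (n_samples,) binary labels.
    Returns (ranking of feature indices, per-feature correlation r)."""
    Fc = F - F.mean(axis=0)
    yc = y - y.mean()
    r = (Fc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Fc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(r)), r
```

Features with |r| near zero carry little marginal information about the presence of hidden data and are candidates for pruning from the steganalysis feature set.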
Enhancing reliability and efficiency for real-time robust adaptive steganography using cyclic redundancy check codes
The development of multimedia and deep learning technology brings new challenges to steganography and steganalysis techniques. Meanwhile, robust steganography, a class of new techniques aiming to solve the problem of covert communication over lossy channels, has become a new research hotspot in the field of information hiding. To improve the communication reliability and efficiency of current real-time robust steganography methods, a concatenated code, composed of Syndrome–Trellis codes (STC) and cyclic redundancy check (CRC) codes, is proposed in this paper. The enhanced robust adaptive steganography framework proposed in this paper is characterized by a strong error detection capability, high coding efficiency, and low embedding costs. On this basis, three adaptive steganographic methods resisting JPEG compression and detection are proposed. Then, the fault tolerance of the proposed steganography methods is analyzed using the residual model of JPEG compression, thus obtaining appropriate coding parameters. Experimental results show that the proposed methods have significantly stronger robustness against compression, and are harder to detect by statistics-based steganalytic methods.
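The CRC outer code's role in the concatenated scheme is to flag payloads corrupted by the lossy channel after extraction. A minimal sketch with a standard CRC-32 (the STC inner code is omitted, and the framing below is an assumption for illustration):

```python
import zlib

def protect(payload: bytes) -> bytes:
    """Append a CRC-32 checksum to the secret payload before the inner
    (STC) embedding stage."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify(extracted: bytes):
    """After extraction from the lossy channel, return the payload if the
    CRC checks out, else None (signalling an error, e.g. for retransmission)."""
    data, tag = extracted[:-4], extracted[-4:]
    return data if zlib.crc32(data).to_bytes(4, "big") == tag else None
```

This strong error detection at a 4-byte overhead is what lets the receiver distinguish a successful covert transmission from one damaged by JPEG recompression.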
Watermarking of 3D shapes using localized constraints
This paper develops a digital watermarking methodology for 3-D graphical objects defined by polygonal meshes. In watermarking or fingerprinting, the aim is to embed a code in a given medium without producing identifiable changes to it. One should be able to retrieve the embedded information even after the shape has suffered various modifications. Two blind watermarking techniques applying perturbations to the local geometry of selected vertices are described in this paper. The proposed methods produce localized changes of vertex locations that do not alter the mesh topology. A study of the effects caused by vertex location modification is provided for a general class of surfaces. The robustness of the proposed algorithms is tested against noise perturbation and object cropping.
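The topology-preserving embedding can be sketched in a few lines: selected vertices are displaced slightly along their normals, with the displacement sign carrying one bit each (a simplified illustration in the spirit of the approach, not the paper's actual localized-constraint scheme):

```python
import numpy as np

def embed(vertices, normals, bits, idx, eps=1e-3):
    """Toy blind embedding: displace each selected vertex along its unit
    normal, sign chosen by the watermark bit. Only vertex positions change;
    the mesh connectivity (topology) is untouched.

    vertices: (n, 3) positions, normals: (n, 3) unit normals,
    bits: watermark bits, idx: indices of the selected carrier vertices.
    """
    out = vertices.copy()
    for b, i in zip(bits, idx):
        out[i] += (1.0 if b else -1.0) * eps * normals[i]
    return out
```

Because only a small, bounded perturbation is applied to a few vertex locations, the change is visually unidentifiable, while a blind detector that knows the carrier vertices can read the bits back from the displacement sign.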